Abstract. We develop a method for measuring homology classes. This involves three
problems. First, we define the size of a homology class, using ideas from relative homology.
Second, we define an optimal basis of a homology group to be the basis whose elements'
sizes have the minimal sum. We provide a greedy algorithm to compute the optimal basis
and measure classes in it. The algorithm runs in $O(\beta^4 n^3 \log^2 n)$ time, where n is the size
of the simplicial complex and $\beta$ is the Betti number of the homology group. Third, we
discuss different ways of localizing homology classes and prove some hardness results.



1. Introduction 

The problem of computing the topological features of a space has recently drawn much 
attention from researchers in various fields, such as high-dimensional data analysis [3], [15],
graphics [13], [5], networks [10] and computational biology [HIE]. Topological features are
often preferable to purely geometric features, as they are more qualitative and global, and 
tend to be more robust. If the goal is to characterize a space, therefore, features which 
incorporate topology seem to be good candidates. 

Once we are able to compute topological features, a natural problem is to rank the 
features according to their importance. The significance of this problem can be justified 
from two perspectives. First, unavoidable errors are introduced in data acquisition, in the 
form of traditional signal noise, and finite sampling of continuous spaces. These errors may 
lead to the presence of many small topological features that are not "real", but are simply 
artifacts of noise or of sampling [19]. Second, many problems are naturally hierarchical.
This hierarchy - which is a kind of multiscale or multi-resolution decomposition - implies
that we want to capture the large scale features first. See Figures 1(a) and 1(b) for examples.

The topological features we use are homology groups over Z2, due to their ease of 
computation. (Thus, throughout this paper, all the additions are mod 2 additions.) We 
would then like to quantify or measure homology classes, as well as collections of classes. 
Specifically, there are three problems we would like to solve: 

(1) Measuring the size of a homology class: We need a way to quantify the size
of a given homology class, and this size measure should agree with intuition. For
example, in Figure 1(a), the measure should be able to distinguish the one large class
(of the 1-dimensional homology group) from the two smaller classes. Furthermore,
the measure should be easy to compute, and applicable to homology groups of any
dimension.

(2) Choosing a basis for a homology group: We would like to choose a "good"
set of homology classes to be the generators for the homology group (of a fixed
dimension). Suppose that $\beta$ is the dimension of this group, and that we are using
$\mathbb{Z}_2$ coefficients; then there are $2^{\beta} - 1$ nontrivial homology classes in total. For a
basis, we need to choose a subset of $\beta$ of these classes, subject to the constraint
that these $\beta$ generate the group. The criterion of goodness for a basis is based
on an overall size measure for the basis, which relies in turn on the size measure
for its constituent classes. For instance, in Figure 1(c), we must choose three from
the seven nontrivial 1-dimensional homology classes: $\{[z_1], [z_2], [z_3], [z_1] + [z_2], [z_1] +
[z_3], [z_2] + [z_3], [z_1] + [z_2] + [z_3]\}$. In this case, the intuitive choice is $\{[z_1], [z_2], [z_3]\}$,
as this choice reflects the fact that there is really only one large cycle.

(3) Localization: We need the smallest cycle to represent a homology class, given a
natural criterion of the size of a cycle. The criterion should be deliberately chosen
so that the corresponding smallest cycle is both mathematically natural and intuitive.
Such a cycle is a "well-localized" representative of its class. For example, in
Figure 1(d), the cycles $z_1$ and $z_2$ are well-localized representatives of their respective
homology classes, whereas $z_3$ is not.

Figure 1: (a,b) A disk with three holes and a 2-handled torus are really more like an annulus and
a 1-handled torus, respectively, because the large features are more important. (c) A
topological space formed from three circles. (d) In a disk with three holes, cycles $z_1$ and
$z_2$ are well-localized; $z_3$ is not.

Furthermore, we make two additional requirements on the solutions of the aforementioned problems.
First, the solution ought to be computable for topological spaces of arbitrary dimension.
Second, the solution should not require that the topological space be embedded, for
example in a Euclidean space; and if the space is embedded, the solution should not make
use of the embedding. These requirements are natural from the theoretical point of view,
but may also be justified based on real applications. In machine learning, it is often assumed
that the data lives on a manifold whose dimension is much smaller than the dimension of
the embedding space. In the study of shape, it is common to enrich the shape with other
quantities, such as curvature, or color and other physical quantities. This leads to high
dimensional manifolds (e.g., 5-7 dimensions) embedded in high dimensional ambient spaces

PI- 

Although there are existing techniques for approaching the problems we have laid out,
to our knowledge, there are no definitions and algorithms satisfying the two requirements.
Ordinary persistence provides a measure of size, but only for inessential
classes, i.e., classes which ultimately die. More recent work [7] attempts to remedy this
situation, but not in an intuitive way. Zomorodian and Carlsson [2T] use advanced algebraic
topological machinery to solve the basis computation and localization problems. However,
both the quality of the result and the complexity depend strongly on the choice of the given
cover; there is, as yet, no suggestion of a canonical cover. Other works like [HI [19] [H] are
restricted to low dimension.

Contributions. In this paper, we solve these problems. Our contributions include: 

• Definitions of the size of homology classes and the optimal homology basis. 

• A provably correct greedy algorithm to compute the optimal homology basis and 
measure its classes. This algorithm uses persistent homology.

• An improvement of the straightforward algorithm using finite field linear algebra. 

• Hardness results concerning the localization of homology classes. 

2. Defining the Problem 

In this section, we provide a technique for ranking homology classes according to their 
importance. Specifically, we solve the first two problems mentioned in Section 1 by formally 
defining (1) a meaningful size measure for homology classes that is computable in arbitrary 
dimension; and (2) an optimal homology basis which distinguishes large classes from small 
ones effectively. 

Since we restrict our work to homology groups over $\mathbb{Z}_2$, when we talk about a
d-dimensional chain, c, we refer to either a collection of d-simplices, or an $n_d$-dimensional
vector over the field $\mathbb{Z}_2$, whose nonzero entries correspond to the included d-simplices. Here $n_d$ is
the number of d-dimensional simplices in the given complex, K. The relevant background
in homology and relative homology can be found in [16].

The Discrete Geodesic Distance. In order to measure the size of homology classes, we
need a notion of distance. As we will deal with a simplicial complex K, it is most natural
to introduce a discrete metric, and corresponding distance functions. We define the discrete
geodesic distance from a vertex $p \in \mathrm{vert}(K)$, $f_p : \mathrm{vert}(K) \to \mathbb{Z}$, as follows. For any vertex
$q \in \mathrm{vert}(K)$, $f_p(q) = \mathrm{dist}(p, q)$ is the length of the shortest path connecting p and q in
the 1-skeleton of K; it is assumed that each edge length is one, though this can easily be
changed. We may then extend this distance function from vertices to higher dimensional
simplices naturally. For any simplex $\sigma \in K$, $f_p(\sigma)$ is the maximal function value of the
vertices of $\sigma$, $f_p(\sigma) = \max_{q \in \mathrm{vert}(\sigma)} f_p(q)$. Finally, we define a discrete geodesic ball $B_p^r$,
$p \in \mathrm{vert}(K)$, $r \ge 0$, as the subset of K, $B_p^r = \{\sigma \in K \mid f_p(\sigma) \le r\}$. It is straightforward
to show that these subsets are in fact subcomplexes, namely, subsets that are still simplicial
complexes.
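
As a minimal sketch of the definitions above (not the authors' implementation), the discrete geodesic distance can be computed by breadth-first search on the 1-skeleton, and a geodesic ball by thresholding the maximal vertex distance of each simplex. The function names, the tuple representation of simplices, and the hollow-square toy complex are assumptions made for illustration only.

```python
from collections import deque

def geodesic_distance(edges, p):
    """BFS distances f_p(q) in the 1-skeleton; every edge has length one."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {p: 0}
    queue = deque([p])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def geodesic_ball(simplices, edges, p, r):
    """Simplices sigma with f_p(sigma) = max over vertices of f_p(q) <= r."""
    f = geodesic_distance(edges, p)
    inf = float("inf")  # unreachable vertices are infinitely far away
    return {s for s in simplices if max(f.get(q, inf) for q in s) <= r}

# Hypothetical toy complex: a hollow square a-b-c-d (4 vertices, 4 edges).
vertices = [("a",), ("b",), ("c",), ("d",)]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
simplices = vertices + edges
ball = geodesic_ball(simplices, edges, "a", 1)
```

Because the value of a simplex is the maximum over its vertices, every face of a simplex in the ball is also in the ball, which is why the ball is a subcomplex.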

2.1. Measuring the Size of a Homology Class 

We start this section by introducing notions from relative homology. Given a simplicial
complex K and a subcomplex $L \subseteq K$, we may wish to study the structure of K by ignoring
all the chains in L. We study the group of relative chains as a quotient group, $C_d(K, L) =
C_d(K)/C_d(L)$, whose elements are relative chains. Analogous to the way we define the group
of cycles $Z_d(K)$, the group of boundaries $B_d(K)$ and the homology group $H_d(K)$ in $C_d(K)$, we
define the group of relative cycles, the group of relative boundaries and the relative homology
group in $C_d(K, L)$, denoted as $Z_d(K, L)$, $B_d(K, L)$ and $H_d(K, L)$, respectively. We denote
$\phi_L : C_d(K) \to C_d(K, L)$ as the homomorphism mapping d-chains to their corresponding
relative chains, and $\phi_L^* : H_d(K) \to H_d(K, L)$ as the induced homomorphism mapping homology
classes of K to their corresponding relative homology classes.

Figure 2: (a) On a disk with three holes, the three shaded regions are the three smallest geodesic
balls measuring the three corresponding classes. (b) On a tube, the smallest geodesic ball
is centered at $q_2$, not $q_1$.

Using these notions, we define the size of a homology class as follows. Given a simplicial
complex K, assume we are given a collection of subcomplexes $\mathcal{C} = \{L \subseteq K\}$. Furthermore,
each of these subcomplexes is endowed with a size. In this case, we define the size of
a homology class h as the size of the smallest L carrying h. Here we say a subcomplex
L carries h if h has a trivial image in the relative homology group $H_d(K, L)$, formally,
$\phi_L^*(h) = B_d(K, L)$. Intuitively, this means that h disappears if we delete L from K, by
contracting it into a point and modding it out.

Definition 2.1. The size of a class h, S(h), is the size of the smallest measurable subcomplex
carrying h, formally, $S(h) = \min_{L \in \mathcal{C}} \mathrm{size}(L)$ such that $\phi_L^*(h) = B_d(K, L)$.

We say a subcomplex L carries a chain c if L contains all the simplices of the chain,
formally, $c \subseteq L$. Using standard facts from algebraic topology, it is straightforward to see
that L carries h if and only if it carries a cycle of h. This gives us more intuition behind 
the measure definition. 

In this paper, we take $\mathcal{C}$ to be the set of discrete geodesic balls, $\mathcal{C} = \{B_p^r \mid p \in
\mathrm{vert}(K), r \ge 0\}$.^1 The size of a geodesic ball is naturally its radius r. The smallest geodesic
ball carrying h is denoted as $B_{\min}(h)$ for convenience, whose radius is S(h). In Figure 2(a),
the three geodesic balls centered at $p_1$, $p_2$ and $p_3$ are the smallest geodesic balls carrying
nontrivial homology classes $[z_1]$, $[z_2]$ and $[z_3]$, respectively. Their radii are the sizes of the
three classes. In Figure 2(b), the smallest geodesic ball carrying a nontrivial homology class
is the pink one centered at $q_2$, not the one centered at $q_1$. Note that these geodesic balls
may not look like Euclidean balls in the embedding space.

2.2. The Optimal Homology Basis 

For the d-dimensional $\mathbb{Z}_2$ homology group whose dimension (Betti number) is $\beta_d$, there
are $2^{\beta_d} - 1$ nontrivial homology classes. However, we only need $\beta_d$ of them to form a basis.
The basis should be chosen wisely so that we can easily distinguish important homology
classes from noise. See Figure 1(c) for an example. There are $2^3 - 1 = 7$ nontrivial homology
classes; we need three of them to form a basis. We would prefer to choose $\{[z_1], [z_2], [z_3]\}$
as a basis, rather than $\{[z_1] + [z_2] + [z_3], [z_2] + [z_3], [z_3]\}$. The former indicates that there is
one big cycle in the topological space, whereas the latter gives the impression of three large
classes.

^1 The idea of growing geodesic discs has been used in [19]. However, this work depends on low dimensional
geometric reasoning, and hence is restricted to 1-dimensional homology classes in 2-manifolds.

In keeping with this intuition, the optimal homology basis is defined as follows. 

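Definition 2.2. The optimal homology basis is the basis for the homology group whose
elements' sizes have the minimal sum, formally,

$$\mathcal{H}_d = \operatorname*{argmin}_{\{h_1, \ldots, h_{\beta_d}\}} \sum_{i=1}^{\beta_d} S(h_i), \quad \dim(\{h_1, \ldots, h_{\beta_d}\}) = \beta_d.$$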



This definition guarantees that large homology classes appear as few times as possible
in the optimal homology basis. In Figure 1(c), the optimal basis will be $\{[z_1], [z_2], [z_3]\}$,
which has only one large class.

For each class in the basis, we need a cycle representing it. As we have shown,
the smallest geodesic ball carrying h carries at least one cycle of h. We localize each class
in the optimal basis by its localized-cycles, which are cycles of h carried by $B_{\min}(h)$. This
is a fair choice because it is consistent with the size measure of h and it is computable in
polynomial time. See Section 5 for further discussion.



3. The Algorithm 

In this section, we introduce an algorithm to compute the optimal homology basis as
defined in Definition 2.2. For each class in the basis, we measure its size, and represent it
with one of its localized-cycles. We first introduce an algorithm to compute the smallest
homology class, namely, Measure-Smallest(K). Based on this procedure, we provide the
algorithm Measure-All(K), which computes the optimal homology basis. The algorithm
takes $O(\beta_d^4 n^4)$ time, where $\beta_d$ is the Betti number for d-dimensional homology classes and
n is the cardinality of the input simplicial complex K.

Persistent Homology. Our algorithm uses the persistent homology algorithm. In persistent
homology, we filter a topological space with a scalar function, and capture the birth
and death times of homology classes of the sublevel set during the course of the filtration. Classes
with longer persistence are considered important. Classes with infinite persistence
are called essential homology classes and correspond to the intrinsic homology classes of
the given topological space. Please refer to [12], [20], [6] for the theory and algorithms of persistent
homology.



3.1. Computing the Smallest Homology Class 

The procedure Measure-Smallest(K) measures and localizes $h_{\min}$, the smallest nontrivial
homology class, namely, the one with the smallest size. The output of this procedure
will be a pair $(S_{\min}, z_{\min})$, namely, the size and a localized-cycle of $h_{\min}$. According to
the definitions, this pair is determined by the smallest geodesic ball carrying $h_{\min}$, namely,
$B_{\min}(h_{\min})$. We first present the algorithm to compute this ball. Second, we explain how
to compute the pair $(S_{\min}, z_{\min})$ from the computed ball.

Procedure Bmin(K): Computing $B_{\min}(h_{\min})$. It is straightforward to see that the ball
$B_{\min}(h_{\min})$ is also the smallest geodesic ball carrying any nontrivial homology class of K. It
can be computed by computing $B_p^{r(p)}$ for all vertices p, where $B_p^{r(p)}$ is the smallest geodesic
ball centered at p which carries any nontrivial homology class. When all the $B_p^{r(p)}$'s are
computed, we compare their radii, the r(p)'s, and pick the smallest ball as $B_{\min}(h_{\min})$.

For each vertex p, we compute $B_p^{r(p)}$ by applying the persistent homology algorithm to
K with the discrete geodesic distance from p, $f_p$, as the filter function. Note that a geodesic
ball $B_p^r$ is the sublevel set $f_p^{-1}(-\infty, r] \subseteq K$. Nontrivial homology classes of K are essential
homology classes in the persistent homology algorithm. (In the rest of this paper, we may
use "essential homology classes" and "nontrivial homology classes of K" interchangeably.)
Therefore, the birth time of the first essential homology class is r(p), and the subcomplex
$f_p^{-1}(-\infty, r(p)]$ is $B_p^{r(p)}$.

Computing $(S_{\min}, z_{\min})$. We compute the pair from the computed ball $B_{\min}(h_{\min})$. For
simplicity, we denote $p_{\min}$ and $r_{\min}$ as the center and radius of the ball. According to the
definition, $r_{\min}$ is exactly the size of $h_{\min}$, $S_{\min}$. Any nonbounding cycle (a cycle that is not
a boundary) carried by $B_{\min}(h_{\min})$ is a localized-cycle of $h_{\min}$.^2 We first compute a basis
for all cycles carried by $B_{\min}(h_{\min})$ using a reduction algorithm. Next, elements in this
basis are checked one by one until we find one which is nonbounding in K. This checking
uses the algorithm of Wiedemann [18] for rank computation of sparse matrices over the field $\mathbb{Z}_2$.

3.2. Computing the Optimal Homology Basis 

In this section, we present the algorithm for computing the optimal homology basis
defined in Definition 2.2, namely, $\mathcal{H}_d$. We first show that the optimal homology basis can
be computed in a greedy manner. Second, we introduce an efficient greedy algorithm.

3.2.1. Computing $\mathcal{H}_d$ in a Greedy Manner. Recall that the optimal homology basis is the
basis for the homology group whose elements' sizes have the minimal sum. We use matroid
theory [9] to show that we can compute the optimal homology basis with a greedy method.
Let H be the set of nontrivial d-dimensional homology classes (i.e., the homology group
minus the trivial class). Let L be the family of sets of linearly independent nontrivial
homology classes. Then we have the following theorem, whose proof is omitted due to
space limitations. The same result has been mentioned in [14].

Theorem 3.1. The pair (H, L) is a matroid when $\beta_d > 0$.

^2 This is true assuming that $B_{\min}(h_{\min})$ carries one and only one nontrivial class, i.e., $h_{\min}$ itself. However,
it is straightforward to relax this assumption.

We construct a weighted matroid by assigning each nontrivial homology class its size
as the weight. This weight function is strictly positive because a nontrivial homology class
cannot be carried by a geodesic ball with radius zero. According to matroid theory, we
can compute the optimal homology basis with a naive greedy method: check the smallest
nontrivial homology classes one by one, until $\beta_d$ linearly independent ones are collected.
The collected $\beta_d$ classes $\{h_{i_1}, h_{i_2}, \ldots, h_{i_{\beta_d}}\}$ form the optimal homology basis $\mathcal{H}_d$. (Note that
the h's are ordered by size, i.e., $S(h_{i_j}) \le S(h_{i_{j+1}})$.) However, this method is exponential in
$\beta_d$. We need a better solution.
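
The naive greedy method can be sketched as follows, under the assumption that every nontrivial class is already given as a pair of its size and its coordinate vector (a bitmask over $\mathbb{Z}_2$) with respect to some reference basis. Both the representation and the name `greedy_basis` are hypothetical; the sizes in the example are the ones quoted for Figure 3 (2, 4 and 5). The input list has $2^{\beta_d} - 1$ entries, which is exactly why this method is exponential in $\beta_d$.

```python
def greedy_basis(classes, beta):
    """Pick beta linearly independent classes (over Z2) in order of size.

    classes: list of (size, bitmask) pairs, one per nontrivial class.
    """
    pivots = {}   # leading bit position -> reduced vector (an XOR basis)
    chosen = []
    for size, vec in sorted(classes):
        v = vec
        # Reduce v against the classes chosen so far; v survives only if it
        # is linearly independent of them.
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                chosen.append((size, vec))
                break
            v ^= pivots[top]
        if len(chosen) == beta:
            break
    return chosen

# Figure 3 example: [z1] -> 0b01 (size 2), [z2] -> 0b10 (size 4),
# [z3] = [z1] + [z2] -> 0b11 (size 5).
basis = greedy_basis([(2, 0b01), (4, 0b10), (5, 0b11)], beta=2)
```

Here the greedy pass keeps $[z_1]$ and $[z_2]$ and never needs to look at $[z_3]$, in agreement with the optimal basis described for that figure.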

3.2.2. Computing $\mathcal{H}_d$ with a Sealing Technique. In this section, we introduce a polynomial
greedy algorithm for computing $\mathcal{H}_d$. Instead of computing the smallest classes one by one,
our algorithm uses a sealing technique and takes time polynomial in $\beta_d$. Intuitively, when
the smallest i classes in $\mathcal{H}_d$ are picked, we make them trivial by adding new simplices to the
given complex. In the augmented complex, any linear combination of these picked classes
becomes trivial, and the smallest nontrivial class is the (i + 1)-th one in $\mathcal{H}_d$.

The algorithm starts by measuring and localizing the smallest homology class of the
given simplicial complex K (using the procedure Measure-Smallest(K) introduced in
Section 3.1), which is also the first class we choose for $\mathcal{H}_d$. We make this class trivial by sealing
one of its cycles - i.e., the localized-cycle we computed - with new simplices. Next, we
measure and localize the smallest homology class of the augmented simplicial complex K'.
This class is the second smallest homology class in $\mathcal{H}_d$. We make this class trivial again and
proceed to the third smallest class in $\mathcal{H}_d$. This process is repeated for $\beta_d$ rounds, yielding
$\mathcal{H}_d$.

We make a homology class trivial by sealing the class's localized-cycle, which we have
computed. To seal this cycle z, we add (a) a new vertex v; (b) a (d + 1)-simplex for each
d-simplex of z, with vertex set equal to the vertex set of the d-simplex together with v; (c)
all of the faces of these new simplices. In Figures 3(a) and 3(b), a 1-cycle with four edges,
$z_1$, is sealed up with one new vertex, four new triangles and four new edges.
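
The sealing step is a cone construction, and a minimal sketch of it is given below. The representation of simplices as frozensets of vertex labels and the name `seal_cycle` are assumptions made for illustration; the example reproduces the count stated for the four-edge cycle.

```python
from itertools import combinations

def seal_cycle(complex_, cycle, new_vertex):
    """Cone the d-cycle `cycle` from `new_vertex`; return the augmented complex."""
    added = set()
    for simplex in cycle:
        cone = frozenset(simplex) | {new_vertex}
        # Add the (d+1)-simplex together with all of its nonempty faces.
        for k in range(1, len(cone) + 1):
            for face in combinations(sorted(cone), k):
                added.add(frozenset(face))
    return set(complex_) | added

# Hypothetical complex K: a hollow square, whose four edges form the cycle z1.
z1 = [frozenset(e) for e in (("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"))]
K = {frozenset(v) for v in "abcd"} | set(z1)
sealed = seal_cycle(K, z1, "v")
new = sealed - K   # simplices that the sealing step introduced
```

Faces that already belong to K are regenerated by the loop but are absorbed by the set union, so `new` contains only the new vertex, the four new edges and the four new triangles.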

It is essential to make sure the new simplices do not influence our measurement. We
assign the new vertices $+\infty$ geodesic distance from any vertex in the original complex K.
Furthermore, in the procedure Measure-Smallest(K'), we will not consider any geodesic
ball centered at these new vertices. In other words, the geodesic distance from these new
vertices will never be used as a filter function. Whenever we run the persistent homology
algorithm, all of the new simplices have $+\infty$ filter function values, formally, $f_p(\sigma) = +\infty$
for all $p \in \mathrm{vert}(K)$ and $\sigma \in K' \setminus K$.

The algorithm is illustrated in Figures 3(a) and 3(b). The 4-edge cycle, $z_1$, and the
8-edge cycle, $z_2$, are the localized-cycles of the smallest and the second smallest homology
classes ($S([z_1]) = 2$, $S([z_2]) = 4$). The nonbounding cycle $z_3 = z_1 + z_2$ corresponds to the
largest nontrivial homology class $[z_3] = [z_1] + [z_2]$ ($S([z_3]) = 5$). After the first round, we
choose $[z_1]$ as the smallest class in $\mathcal{H}_1$. Next, we destroy $[z_1]$ by sealing $z_1$, which yields the
augmented complex K'. This time, we choose $[z_2]$, giving $\mathcal{H}_1 = \{[z_1], [z_2]\}$.

Correctness. We prove in Theorem 3.3 the correctness of our greedy method. We begin by
proving a lemma that destroying the smallest nontrivial class will neither destroy any other
classes nor create any new classes. Please note that this is not a trivial result. The lemma
does not hold if we seal an arbitrary class instead of the smallest one. See Figures 3(c) and
3(d) for examples. Our proof is based on the assumption that the smallest nontrivial class
$h_{\min}$ is the only one carried by $B_{\min}(h_{\min})$.

Lemma 3.2. Given a simplicial complex K, if we seal its smallest homology class $h_{\min}(K)$,
any other nontrivial homology class of K, h, is still nontrivial in the augmented simplicial
complex K'. In other words, any cycle of h is still nonbounding in K'.

Figure 3: (a,b) The original complex K and the augmented complex K' after destroying the smallest
class, $[z_1]$. (c) If the original complex K consists of the two cycles $z_1$ and $z_2$, destroying a
larger class $[z_1] + [z_2]$ will make all other classes trivial too. (d) The original complex K
consists of the two cycles and an edge connecting them. Destroying $[z_1] + [z_2]$ will make
all other classes trivial and create a new class.

This lemma leads to the correctness of our algorithm, namely, Theorem 3.3. We prove
this theorem by showing that the procedure Measure-All(K) produces the same result as
the naive greedy algorithm.

Theorem 3.3. The procedure Measure-All(K) computes $\mathcal{H}_d$.

4. An Improvement Using Finite Field Linear Algebra 

In this section, we present an improvement on the algorithm presented in the previous 
section, more specifically, an improvement on computing the smallest geodesic ball carrying 
any nontrivial class (the procedure Bmin). The idea is based on the finite field linear algebra 
behind the homology. 

We first observe that for neighboring vertices, $p_1$ and $p_2$, the birth times of the first
essential homology class using $f_{p_1}$ and $f_{p_2}$ as filter functions are close (Theorem 4.1). This
observation suggests that for each p, instead of computing $B_p^{r(p)}$, we may just test whether
the geodesic ball centered at p with a certain radius carries any essential homology class.
Second, with some algebraic insight, we reduce the problem of testing whether a geodesic ball
carries any essential homology class to the problem of comparing dimensions of two vector
spaces. Furthermore, we use Theorem 4.2 to reduce the problem to rank computations
of sparse matrices over the field $\mathbb{Z}_2$, for which we have ready tools from the literature. In
what follows, we assume that K has a single component; multiple components can be 
accommodated with a simple modification. 

Complexity. In doing so, we improve the complexity to $O(\beta_d^4 n^3 \log^2 n)$. A more detailed
complexity analysis is omitted due to space limitations.^3

^3 This complexity is close to that of the persistent homology algorithm, whose complexity is $O(n^3)$. Given
the nature of the problem, it seems likely that the persistence complexity is a lower bound. If this is the
case, the current algorithm is nearly optimal. Cohen-Steiner et al. [8] provided a linear algorithm to maintain
the persistences while changing the filter function. While interesting, this algorithm is not applicable in our
case.

Next, we present details of the improvement. In Section 4.1, we prove Theorem 4.1 and
provide details of the improved algorithm. In Section 4.2, we explain how to test whether
a certain subcomplex carries any essential homology class of K. For convenience, in this
section, we use "carrying nonbounding cycles" and "carrying essential homology classes"
interchangeably, because a geodesic ball carries essential homology classes of K if and only
if it carries nonbounding cycles of K.

4.1. The Stability of Persistence Leads to An Improvement 

Cohen-Steiner et al. [6] proved that the change, suitably defined, of the persistence of
homology classes is bounded by the changes of the filter functions. Since the filter functions
of two neighboring vertices, $f_{p_1}$ and $f_{p_2}$, are close to each other, the birth times of the first
nonbounding cycles in both filters are close as well. This leads to Theorem 4.1, for which a simple
proof is provided.

Theorem 4.1. If two vertices $p_1$ and $p_2$ are neighbors, the birth times of the first nonbounding
cycles for filter functions $f_{p_1}$ and $f_{p_2}$ differ by no more than 1.

Proof. That $p_1$ and $p_2$ are neighbors implies that for any vertex q, $f_{p_2}(q) \le f_{p_2}(p_1) + f_{p_1}(q) =
1 + f_{p_1}(q)$, in which the inequality follows from the triangle inequality. Therefore, $B_{p_1}^{r(p_1)}$ is a
subset of $B_{p_2}^{r(p_1)+1}$. Since the former carries nonbounding cycles, the latter does too,
and thus $r(p_2) \le r(p_1) + 1$. Similarly, we have $r(p_1) \le r(p_2) + 1$. ■

This theorem suggests a way to avoid computing $B_p^{r(p)}$ for all $p \in \mathrm{vert}(K)$ in the
procedure Bmin. Since our objective is to find the minimum of the r(p)'s, we do a breadth-first
search through all the vertices with global variables $r_{\min}$ and $p_{\min}$ recording the smallest
r(p) we have found and its corresponding center p, respectively. We start by applying the
persistent homology algorithm on K with filter function $f_{p_0}$, where $p_0$ is an arbitrary vertex
of K. Initialize $r_{\min}$ as the birth time of the first nonbounding cycle, $r(p_0)$, and $p_{\min}$
as $p_0$. Next, we do a breadth-first search through the remaining vertices. For each vertex $p_i$, $i \ne 0$,
there is a neighbor $p_j$ we have visited (the parent vertex of $p_i$ in the breadth-first search tree).
We know that $r(p_j) \ge r_{\min}$ and $r(p_i) \ge r(p_j) - 1$ (Theorem 4.1). Therefore, $r(p_i) \ge r_{\min} - 1$.
We only need to test whether the geodesic ball $B_{p_i}^{r_{\min} - 1}$ carries any nonbounding cycle of
K. If so, $r_{\min}$ is decremented by one, and $p_{\min}$ is updated to $p_i$. After all vertices are
visited, $p_{\min}$ and $r_{\min}$ give us the ball we want.
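
The search just described can be sketched as below. The two callbacks are hypothetical stand-ins: `first_birth(p)` plays the role of one run of the persistent homology algorithm (returning r(p)), and `carries(p, r)` plays the role of the test whether the ball of radius r centered at p carries a nonbounding cycle of K. The toy oracle in the example is built from an assumed table of r(p) values on a path graph in which neighboring values differ by at most one, as Theorem 4.1 requires.

```python
from collections import deque

def find_smallest_ball(adj, p0, first_birth, carries):
    """Breadth-first search for the center and radius of the smallest ball."""
    r_min, p_min = first_birth(p0), p0   # the only full persistence run
    seen = {p0}
    queue = deque([p0])
    while queue:
        u = queue.popleft()
        for p in adj[u]:
            if p in seen:
                continue
            seen.add(p)
            queue.append(p)
            # r(p) >= r_min - 1, so a single test at radius r_min - 1 decides
            # whether p improves on the best ball found so far.
            if carries(p, r_min - 1):
                r_min -= 1
                p_min = p
    return p_min, r_min

# Hypothetical example: path graph 0-1-2-3-4 with assumed values of r(p).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
r_table = {0: 3, 1: 2, 2: 1, 3: 2, 4: 3}
result = find_smallest_ball(
    adj, 0,
    first_birth=lambda p: r_table[p],
    carries=lambda p, r: r >= r_table[p],   # larger balls carry whatever smaller ones do
)
```

Each vertex other than the starting one costs one carrying test rather than a full persistence computation, which is where the saving comes from.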

However, testing whether the subcomplex $B_{p_i}^{r_{\min} - 1}$ carries any nonbounding cycle of K
is not as easy as computing nonbounding cycles of the subcomplex. A nonbounding cycle
of $B_{p_i}^{r_{\min} - 1}$ may not be nonbounding in K as we require. For example, in Figures 4(a) and
4(b), the simplicial complex K is a torus with a tail. The pink geodesic ball in the first
figure does not carry any nonbounding cycle of K, although it carries its own nonbounding
cycles. The geodesic ball in the second figure is the one that carries nonbounding cycles of
K. Therefore, we need algebraic tools to distinguish nonbounding cycles of K from those
of the subcomplex $B_{p_i}^{r_{\min} - 1}$.

4.2. Procedure Contain-Nonbounding-Cycle: Testing Whether a Subcomplex Carries
Nonbounding Cycles of K

In this section, we present the procedure for testing whether a subcomplex $K_0$ carries
any nonbounding cycle of K. A chain in $K_0$ is a cycle if and only if it is a cycle of K.
However, solely from $K_0$, we are not able to tell whether a cycle carried by $K_0$ bounds or
not in K. Instead, we write the set of cycles of K carried by $K_0$, $Z_d^{K_0}(K)$, and the set of
boundaries of K carried by $K_0$, $B_d^{K_0}(K)$, as sets of linear combinations with certain constraints.
Consequently, we are able to test whether any cycle carried by $K_0$ is nonbounding
in K by comparing their dimensions. Formally, we define $B_d^{K_0}(K) = B_d(K) \cap C_d(K_0)$ and
$Z_d^{K_0}(K) = Z_d(K) \cap C_d(K_0)$.

Figure 4: (a,b) In a torus with a tail, only the ball in the second figure carries nonbounding cycles
of K, although in both figures the balls have nontrivial topology. (c,d) The cycles with
the minimal radius and the minimal diameter, $z_r$ and $z_d$ (used in Section 5).

Let $H_d = [z_1, \ldots, z_{\beta_d}]$ be the matrix whose column vectors are arbitrary $\beta_d$ nonbounding
cycles of K which are not homologous to each other. The boundary group and the cycle
group of K are column spaces of the matrices $\partial_{d+1}$ and $\hat{Z}_d = [\partial_{d+1}, H_d]$, respectively. Using
finite field linear algebra, we have the following theorem, whose proof is omitted due to
space limitations.

Theorem 4.2. $K_0$ carries nonbounding cycles of K if and only if

$$\mathrm{rank}(\hat{Z}_d^{K \setminus K_0}) - \mathrm{rank}(\partial_{d+1}^{K \setminus K_0}) \ne \beta_d,$$

where $\partial_{d+1}^{K \setminus K_0}$ and $\hat{Z}_d^{K \setminus K_0}$ are the rows of the matrices $\partial_{d+1}$ and $\hat{Z}_d$, respectively,
that correspond to the d-simplices of K not in $K_0$.

We use the algorithm of Wiedemann [18] for the rank computation. In our algorithm,
the boundary matrix $\partial_{d+1}$ is given. The matrix $H_d$ can be precomputed as follows. We
perform a column reduction on the boundary matrix $\partial_d$ to compute a basis for the cycle
group $Z_d(K)$. We check elements in this basis one by one until we collect $\beta_d$ of them forming
$H_d$. For each cycle z in this cycle basis, we check whether z is linearly independent of the
d-boundaries and the nonbounding cycles we have already chosen. More details are omitted
due to space limitations.
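
The rank test can be illustrated on a small example. The sketch below assumes the criterion reads: $K_0$ carries nonbounding cycles of K if and only if the rank of $[\partial_{d+1}, H_d]$ minus the rank of $\partial_{d+1}$, both restricted to the rows of d-simplices outside $K_0$, differs from $\beta_d$. It uses plain Gaussian elimination over $\mathbb{Z}_2$ on bitmask rows as a stand-in for Wiedemann's algorithm, and the function names and the toy complex (a filled triangle abc glued to a hollow triangle acd along the edge ca) are hypothetical.

```python
def rank_gf2(rows):
    """Rank over Z2 of a matrix given as a list of integer bitmask rows."""
    pivots = {}
    for v in rows:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def carries_nonbounding(boundary_rows, z_hat_rows, outside, beta):
    """Rank test; `outside` lists the indices of d-simplices not in K0."""
    rank_b = rank_gf2([boundary_rows[i] for i in outside])
    rank_z = rank_gf2([z_hat_rows[i] for i in outside])
    return rank_z - rank_b != beta

# Edges (rows): 0 = ab, 1 = bc, 2 = ca, 3 = cd, 4 = da.
# Column bit 0: boundary of the filled triangle abc (edges 0, 1, 2).
# Column bit 1: the nonbounding cycle ca + cd + da (edges 2, 3, 4).
boundary_rows = [0b1, 0b1, 0b1, 0b0, 0b0]
z_hat_rows = [0b01, 0b01, 0b11, 0b10, 0b10]
beta_1 = 1

filled_triangle = carries_nonbounding(boundary_rows, z_hat_rows, [3, 4], beta_1)
hollow_triangle = carries_nonbounding(boundary_rows, z_hat_rows, [0, 1], beta_1)
```

With $K_0$ equal to the edges of the filled triangle the test returns False, since the only cycle it carries is a boundary; with $K_0$ equal to the edges of the hollow triangle it returns True.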

5. Localizing Classes 

In this section, we address the localization problem. We formalize the localization
problem as a combinatorial optimization problem: given a simplicial complex K, compute
the representative cycle of a given homology class minimizing a certain objective function.
Formally, given an objective function defined on all the cycles, $\mathrm{cost} : Z_d(K) \to \mathbb{R}$, we want
to localize a given class with its optimally localized cycle, $z_{opt}(h) = \operatorname{argmin}_{z \in h} \mathrm{cost}(z)$. In
general, we assume the class h is given by one of its representative cycles, $z_0$.

We explore three options for the objective function cost(z), i.e., the volume, diameter and
radius of a given cycle z. We show that the cycle with the minimal volume and the cycle
with the minimal diameter are NP-hard to compute. The cycle with the minimal radius,
which is the localized-cycle we defined and computed in previous sections, is a fair choice.
Due to space limitations, we omit proofs of theorems in this section.

Definition 5.1 (Volume). The volume of z is the number of its simplices, vol(z) = card(z). 

For example, the volume of a 1-dimensional cycle, a 2-dimensional cycle and a 3-dimensional
cycle are the numbers of their edges, triangles and tetrahedra, respectively.
A cycle with the smallest volume, denoted as $z_v$, is consistent with the intuitive notion of a
"well-localized" cycle. Its 1-dimensional version, the shortest cycle of a class, has been studied by
researchers [HI [19] [H]. However, we prove in Theorem 5.2 that computing $z_v$ of h is NP-hard.^4
The proof is by reduction from the NP-hard problem MAX-2SAT-B [IT]. More
generally, we can extend the volume to be the sum of the weights assigned to simplices
of the cycle, given an arbitrary weight function defined on all the simplices of K. The
corresponding smallest cycle is still NP-hard to compute.

Theorem 5.2. Computing $z_v$ for a given h is NP-hard.

Since it is NP-hard to compute $z_v$, one may resort to the geodesic distance between
elements of z. The second choice of the objective function is the diameter.

Definition 5.3 (Diameter). The diameter of a cycle is the diameter of its vertex set,
$\mathrm{diam}(z) = \mathrm{diam}(\mathrm{vert}(z))$, in which the diameter of a set of vertices is the maximal geodesic
distance between them, formally, $\mathrm{diam}(S) = \max_{p, q \in S} \mathrm{dist}(p, q)$.

Intuitively, a representative cycle of h with the minimal diameter, denoted $z_d$, is the
cycle whose vertices are as close to each other as possible. The intuition will be further
illustrated by comparison against the radius criterion. We prove in Theorem 5.4 that computing
$z_d$ of h is NP-hard, by reduction from the NP-hard Multiple-Choice Cover Problem
(MCCP) of Arkin and Hassin [2].

Theorem 5.4. Computing $z_d$ for a given h is NP-hard.

The third option of the objective function is the radius. 

Definition 5.5 (Radius). The radius of a cycle is the radius of the smallest geodesic ball
carrying it, formally, $\mathrm{rad}(z) = \min_{p \in \mathrm{vert}(K)} \max_{q \in \mathrm{vert}(z)} \mathrm{dist}(p, q)$, where vert(K) and vert(z)
are the sets of vertices of the given simplicial complex K and the cycle z, respectively.
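
The three objective functions can be evaluated directly from their definitions. The following sketch does so for a 1-dimensional cycle, assuming a connected 1-skeleton given as an adjacency dictionary with unit edge lengths; the function names and the example (a hexagon whose six vertices are all joined to a center vertex 6) are hypothetical.

```python
from collections import deque

def distances_from(adj, s):
    """BFS distances from s in the 1-skeleton."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def cycle_vertices(cycle):
    return {v for simplex in cycle for v in simplex}

def volume(cycle):
    """Number of simplices in the cycle."""
    return len(cycle)

def diameter(adj, cycle):
    """Maximal geodesic distance between two vertices of the cycle."""
    verts = cycle_vertices(cycle)
    return max(max(distances_from(adj, p)[q] for q in verts) for p in verts)

def radius(adj, cycle):
    """Radius of the smallest geodesic ball, centered anywhere in K, carrying the cycle."""
    verts = cycle_vertices(cycle)
    return min(max(distances_from(adj, p)[q] for q in verts) for p in adj)

# Hypothetical example: hexagon 0..5 with every vertex joined to a center 6.
adj = {i: {(i - 1) % 6, (i + 1) % 6, 6} for i in range(6)}
adj[6] = set(range(6))
hexagon = [(i, (i + 1) % 6) for i in range(6)]
vol, diam, rad = volume(hexagon), diameter(adj, hexagon), radius(adj, hexagon)
```

In this example the minimizing center is a vertex of K that does not lie on the cycle, and the diameter is twice the radius.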

The representative cycle with the minimal radius, denoted as $z_r$, is the same as the
localized-cycle defined and computed in previous sections. Intuitively, $z_r$ is the cycle whose
vertices are as close to a vertex of K as possible. However, $z_r$ may not necessarily be
localized in the intuitive sense. It may wiggle a lot while still being carried by the smallest geodesic
ball carrying the class. See Figure 4(c), in which we localize the only nontrivial homology
class of an annulus (the light gray area). The dark gray area is the smallest geodesic ball
carrying the class, whose center is p. In contrast, the cycle with the minimal diameter (Figure
4(d)) avoids this wiggling problem and is concise in the intuitive sense. This in turn justifies the
choice of diameter.^5 We can prove that $z_r$ can be computed in polynomial time and is a
2-approximation of $z_d$.

^4 Erickson and Whittlesey [H] localized 1-dimensional classes with their shortest representative cycles.
Their polynomial algorithm can only localize classes in the shortest homology basis, not arbitrary given
classes.

^5 This figure also illustrates that the radius and the diameter of a cycle are not strictly related. For the
cycle $z_r$ on the left, its diameter is twice its radius. For the cycle $z_d$ in the center, its diameter is equal to
its radius.

Theorem 5.6. We can compute $z_r$ in polynomial time.

Theorem 5.7. $\mathrm{diam}(z_r) \le 2\,\mathrm{diam}(z_d)$.

This bound is tight. In Figures 4(c) and 4(d), the diameter of the cycle $z_r$ is
twice the radius of the dark gray geodesic ball. The diameter of the cycle $z_d$ is the same
as the radius of the ball. We have $\mathrm{diam}(z_r) = 2\,\mathrm{diam}(z_d)$.